
Summary 


A three-parameter generalisation of the beta-binomial distribution (BBD) is
derived and examined. We obtain the maximum likelihood estimates of the
parameters and show that the regularity conditions for asymptotic efficiency are
satisfied. To exhibit the applicability of the generalised distribution we show
how it gives an improved fit over the BBD for magazine exposure and consumer
purchasing data. Finally we derive an empirical Bayes estimate of a binomial
proportion based on the generalised beta distribution used in this study.


1.  Introduction 


Suppose an advertiser is about to launch an advertising campaign by placing
one advertisement in $k$ successive issues of a magazine. His objective is to
estimate the proportion of the population which sees none, one, or up to all $k$
of the advertisements. We say that an individual reading a magazine is exposed to
an advertisement when he or she sees the ad. Let $X$ be the number of exposures
and $P$ the probability an individual is exposed to any one ad. The distribution
of $X$ is called the exposure distribution (e.d.). It is reasonable to assume that
$X \mid P = p \sim \mathrm{bin}(k, p)$ if, given $P = p$, readership of successive
issues is independent (Ehrenberg 1975 discusses this assumption in some detail).
Now let $P$ be a random variable having a beta distribution. The beta distribution
is particularly attractive since it can assume so many shapes. By compounding the
binomial with the beta distribution we obtain the BBD. It is also known as the
negative hypergeometric distribution (Johnson and Kotz 1969). The mass function
of the BBD is

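$$f^{BB}(X = x) = \binom{k}{x}\,\frac{\Gamma(\alpha+\beta)}{\Gamma(\alpha+\beta+k)}\,\frac{\Gamma(k-x+\beta)}{\Gamma(\beta)}\,\frac{\Gamma(x+\alpha)}{\Gamma(\alpha)},$$

$$x = 0, 1, \ldots, k, \quad \alpha > 0, \quad \beta > 0,$$

where $\Gamma(l+1) = l\,\Gamma(l)$ is the usual gamma function.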
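As a minimal sketch, the BBD mass function can be evaluated numerically on the log scale. The function name `bbd_pmf` and the parameter values below are illustrative assumptions, not taken from the paper.

```python
from math import comb, exp, lgamma

def bbd_pmf(x, k, alpha, beta):
    """Beta-binomial mass function, computed via log-gamma for stability."""
    log_p = (lgamma(alpha + beta) - lgamma(alpha + beta + k)
             + lgamma(k - x + beta) - lgamma(beta)
             + lgamma(x + alpha) - lgamma(alpha))
    return comb(k, x) * exp(log_p)

# Illustrative values only: k = 4 issues, arbitrary alpha and beta.
probs = [bbd_pmf(x, 4, 0.5, 2.0) for x in range(5)]
print(probs, sum(probs))  # the probabilities should sum to 1
```

With $\alpha = \beta = 1$ the beta distribution is uniform and the BBD puts equal mass $1/(k+1)$ on each value, which gives a quick sanity check.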

The BBD was first derived by Skellam (1948). Multivariate generalizations of
the BBD were studied by Ishii and Hayakawa (1960). The BBD has been successfully
applied to estimating the e.d. of media schedules with one magazine
(Metheringham 1964; Chandon 1976), television schedules (Rust and Klompmaker
1981), consumer purchasing behaviour (Morrison 1979) and as an indicator of
television loyalty (Sabavala and Morrison 1977). In addition the BBD has been
used to estimate the distribution of household disease (Griffiths 1973) and
proportions with extraneous variance (Kleinman 1973; Moore 1987).

Many people subscribe to one or more magazines. Among them is a proportion
which always reads a particular magazine. Chandon (1976) suggests a modification
of the BBD which he calls the "two segment beta-binomial model". One segment
is definite readers and the other is probable readers. In Section 2 we will derive a
generalisation of the BBD based on the two segment model introduced by Chandon
and develop it further. In Section 3 we will look at maximum likelihood estimation
of the parameters of this new distribution, then prove that regularity conditions for
these estimates to be asymptotically efficient are satisfied. In Section 4 we will give
some examples where the new distribution gives an improved fit over the BBD for
magazine exposure data and consumer purchasing behaviour. Finally, in Section 5
we will use the generalised BBD to obtain an empirical Bayes estimate of a binomial
proportion. This estimate will be applied to simulation of magazine e.d.'s.

2.  Generalisation  of  the  BBD 


Let $\omega$ and $1 - \omega$ represent the proportion of definite and probable
readers, respectively. The parameter $\omega$ may be viewed as a loyalty factor since
a high value of $\omega$ indicates an appreciable reading loyalty whilst a low value
indicates little or no loyalty to a particular magazine.

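To incorporate this reading loyalty proportion we change the distribution of $P$
from a beta distribution to a beta distribution mixed with a distribution degenerate
at $p = 1$. The cumulative distribution function of $P$ is now

$$F_P^{MB}(p) = (1-\omega)\int_0^p \frac{\Gamma(\alpha+\beta)}{\Gamma(\alpha)\,\Gamma(\beta)}\,t^{\alpha-1}(1-t)^{\beta-1}\,dt + \omega\,I\{p = 1\}, \tag{2.1}$$

$$0 \le p \le 1, \quad \alpha > 0, \quad \beta > 0, \quad 0 \le \omega \le 1.$$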


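When $F_P^{MB}(p)$ is compounded with the $\mathrm{bin}(k, p)$ distribution we obtain the
modified BBD (MBBD) with mass function

$$f^{MB}(X = x) = (1-\omega)\binom{k}{x}\,\frac{\Gamma(\alpha+\beta)}{\Gamma(\alpha+\beta+k)}\,\frac{\Gamma(k-x+\beta)}{\Gamma(\beta)}\,\frac{\Gamma(x+\alpha)}{\Gamma(\alpha)} + \omega\,I\{x = k\}, \quad x = 0, \ldots, k. \tag{2.2}$$

When $\omega = 0$ the MBBD reduces to the BBD.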
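The mixture structure of (2.2) can be sketched in a few lines: the MBBD is the BBD scaled by $1 - \omega$, plus a point mass $\omega$ at $x = k$. The function names are illustrative assumptions. The parameter values are the rounded MBBD estimates reported in Table 1, so the resulting frequencies only approximate the tabulated ones.

```python
from math import comb, exp, lgamma

def bbd_pmf(x, k, alpha, beta):
    """Beta-binomial mass function (log-gamma form)."""
    log_p = (lgamma(alpha + beta) - lgamma(alpha + beta + k)
             + lgamma(k - x + beta) - lgamma(beta)
             + lgamma(x + alpha) - lgamma(alpha))
    return comb(k, x) * exp(log_p)

def mbbd_pmf(x, k, alpha, beta, omega):
    """Modified BBD: (1 - omega) * BBD plus a point mass omega at x = k."""
    point_mass = omega if x == k else 0.0
    return (1.0 - omega) * bbd_pmf(x, k, alpha, beta) + point_mass

# Rounded estimates from Table 1 (k = 4, n = 5201); results are approximate.
expected = [5201 * mbbd_pmf(x, 4, 0.024, 2.113, 0.017) for x in range(5)]
print([round(e, 1) for e in expected])
```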


We derive the factorial moments of the MBBD using knowledge of the factorial
moments of the BBD and linearity of the expectation operator. They are

$$\mu_{(l)} = (1-\omega)\,\frac{k!}{(k-l)!}\,\frac{\Gamma(\alpha+l)}{\Gamma(\alpha)}\,\frac{\Gamma(\alpha+\beta)}{\Gamma(\alpha+\beta+l)} + \frac{k!}{(k-l)!}\,\omega, \quad l = 1, 2, \ldots$$


In particular,

$$E(X) = k\,\frac{\alpha+\omega\beta}{\alpha+\beta}$$

and

$$\mathrm{var}(X) = (1-\omega)\,\frac{k\alpha\beta(\alpha+\beta+k) + k^2\omega\beta^2(\alpha+\beta+1)}{(\alpha+\beta)^2(\alpha+\beta+1)}. \tag{2.3}$$
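The closed forms for the mean and variance can be checked numerically against moments computed directly from the mass function. This is a sketch only; the function names and the parameter values are arbitrary illustrative choices.

```python
from math import comb, exp, lgamma

def mbbd_pmf(x, k, a, b, w):
    """MBBD mass function: (1 - w) * BBD plus a point mass w at x = k."""
    log_p = (lgamma(a + b) - lgamma(a + b + k) + lgamma(k - x + b)
             - lgamma(b) + lgamma(x + a) - lgamma(a))
    return (1 - w) * comb(k, x) * exp(log_p) + (w if x == k else 0.0)

def mbbd_mean(k, a, b, w):
    """Closed-form E(X)."""
    return k * (a + w * b) / (a + b)

def mbbd_var(k, a, b, w):
    """Closed-form var(X), equation (2.3)."""
    num = k * a * b * (a + b + k) + k**2 * w * b**2 * (a + b + 1)
    return (1 - w) * num / ((a + b)**2 * (a + b + 1))

k, a, b, w = 4, 0.8, 1.7, 0.1  # arbitrary illustrative values
pmf = [mbbd_pmf(x, k, a, b, w) for x in range(k + 1)]
mean_direct = sum(x * p for x, p in enumerate(pmf))
var_direct = sum((x - mean_direct)**2 * p for x, p in enumerate(pmf))
print(mean_direct, mbbd_mean(k, a, b, w))  # the two values should agree
print(var_direct, mbbd_var(k, a, b, w))    # the two values should agree
```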


3.  Parameter  Estimation 

Chandon (1976) estimates $\alpha$, $\beta$ and $\omega$ by equating the sample proportion of
nonreaders to the proportion of nonreaders given by the MBBD model for $k =
1, 2, 3$. In Section 4 it will be seen that we do not have these data at our disposal.
Furthermore, Chandon's method does not produce estimates having any optimality
properties such as being BAN and, in addition, suffers from inconsistencies which
sometimes force him to set $\omega = 0$, thereby losing any advantage of using the MBBD.

We will estimate $\alpha$, $\beta$ and $\omega$ using maximum likelihood.

Let $n_i$ be the number of people in the sample (of size $n = \sum_{i=0}^{k} n_i$) who see $i$
out of $k$ issues of a magazine, $i = 0, 1, \ldots, k$. We will see in Section 4 how these
data are obtained from a media sample survey.

Define $c = c(\alpha, \beta) = [\Gamma(\alpha+k)\,\Gamma(\alpha+\beta)] / [\Gamma(\alpha)\,\Gamma(\alpha+\beta+k)]$.


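Then the likelihood equations are

$$\frac{\partial \log L}{\partial\alpha} = \sum_{i=1}^{k-1} A_1(\alpha, i)\,n_i - (n - n_k)\,A_1(\alpha+\beta, k) + \frac{(1-\omega)\,c\,[A_1(\alpha, k) - A_1(\alpha+\beta, k)]\,n_k}{\omega + (1-\omega)c},$$

$$\frac{\partial \log L}{\partial\beta} = \sum_{i=0}^{k-1} A_1(\beta, k-i)\,n_i - (n - n_k)\,A_1(\alpha+\beta, k) - \frac{(1-\omega)\,c\,A_1(\alpha+\beta, k)\,n_k}{\omega + (1-\omega)c},$$

$$\frac{\partial \log L}{\partial\omega} = -\frac{n - n_k}{1-\omega} + \frac{(1 - c)\,n_k}{\omega + (1-\omega)c},$$

where $A_1(\gamma, l) = \sum_{j=0}^{l-1} 1/(\gamma + j)$.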

By equating $\partial \log L/\partial\omega$ to zero we get

$$\hat\omega = \frac{n_k/n - c(\hat\alpha, \hat\beta)}{1 - c(\hat\alpha, \hat\beta)}.$$
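The closed-form solution for the loyalty parameter is easy to evaluate once $\alpha$ and $\beta$ are given. The sketch below assumes the form $\hat\omega = (n_k/n - c)/(1 - c)$ with $c$ as defined above; the function names are illustrative. It is evaluated at the rounded MBBD estimates and the observed count for four exposures from Table 1, so the result should be close to, but not exactly, the reported 0.017.

```python
from math import exp, lgamma

def c_value(alpha, beta, k):
    """c(alpha, beta) = Gamma(a+k)Gamma(a+b) / [Gamma(a)Gamma(a+b+k)]."""
    return exp(lgamma(alpha + k) + lgamma(alpha + beta)
               - lgamma(alpha) - lgamma(alpha + beta + k))

def omega_hat(n_k, n, alpha, beta, k):
    """Closed-form maximiser of the likelihood in omega for fixed alpha, beta."""
    c = c_value(alpha, beta, k)
    return (n_k / n - c) / (1.0 - c)

# n_k = 95 respondents with all k = 4 exposures, out of n = 5201 (Table 1).
print(omega_hat(95, 5201, 0.024, 2.113, 4))
```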


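Substitution of $\hat\omega$ into the first order partial derivatives results in the following
second order partial derivatives

$$\frac{\partial^2 \log L}{\partial\alpha^2} = \frac{(n - n_k)\,c}{1 - c}\left\{\frac{[A_1(\alpha, k) - A_1(\alpha+\beta, k)]^2}{1 - c} + A_2(\alpha+\beta, k) - A_2(\alpha, k)\right\} - \sum_{i=1}^{k-1} A_2(\alpha, i)\,n_i + (n - n_k)\,A_2(\alpha+\beta, k),$$

$$\frac{\partial^2 \log L}{\partial\alpha\,\partial\beta} = \frac{(n - n_k)\,c}{1 - c}\left\{\frac{[A_1(\alpha+\beta, k) - A_1(\alpha, k)]\,A_1(\alpha+\beta, k)}{1 - c} + A_2(\alpha+\beta, k)\right\} + (n - n_k)\,A_2(\alpha+\beta, k),$$

$$\frac{\partial^2 \log L}{\partial\beta^2} = \frac{(n - n_k)\,c}{1 - c}\left\{\frac{[A_1(\alpha+\beta, k)]^2}{1 - c} + A_2(\alpha+\beta, k)\right\} - \sum_{i=0}^{k-1} A_2(\beta, k-i)\,n_i + (n - n_k)\,A_2(\alpha+\beta, k),$$

where $A_2(\gamma, l) = -\frac{\partial}{\partial\gamma} A_1(\gamma, l)$.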


The MBBD likelihood equations have no closed-form solution but may be solved
by the Newton-Raphson method using the above partial derivatives. Since $\hat\omega$ is an
explicit function of $\hat\alpha$ and $\hat\beta$ the numerical work is considerably reduced.

In Section 4 we will define some criteria by which the effectiveness of an
advertising campaign is judged. These criteria are functions of $\theta = (\alpha, \beta, \omega)$. To obtain
asymptotic variances for the criteria estimates we first need the asymptotic joint
distribution of $\hat\theta = (\hat\alpha, \hat\beta, \hat\omega)$.
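Because $\omega$ can be eliminated in closed form, the iteration only has to work in $(\alpha, \beta)$. The sketch below is not a full Newton-Raphson routine: it codes the profile log-likelihood and the first-order derivatives obtained after substituting the closed-form $\omega$, and compares the analytic score with a central finite difference at an arbitrary trial point. All function names and the trial point are assumptions made for illustration; the counts are the observed frequencies in Table 1.

```python
from math import comb, exp, lgamma, log

def a1(g, l):
    """A_1(g, l) = sum_{j=0}^{l-1} 1 / (g + j)."""
    return sum(1.0 / (g + j) for j in range(l))

def c_value(a, b, k):
    return exp(lgamma(a + k) + lgamma(a + b) - lgamma(a) - lgamma(a + b + k))

def profile_loglik(a, b, counts):
    """Log-likelihood with omega replaced by its closed-form maximiser."""
    k, n = len(counts) - 1, sum(counts)
    c = c_value(a, b, k)
    w = (counts[k] / n - c) / (1 - c)
    ll = counts[k] * log((1 - w) * c + w)
    for i in range(k):
        log_bb = (log(comb(k, i)) + lgamma(a + b) - lgamma(a + b + k)
                  + lgamma(k - i + b) - lgamma(b) + lgamma(i + a) - lgamma(a))
        ll += counts[i] * (log(1 - w) + log_bb)
    return ll

def profile_score(a, b, counts):
    """Analytic first derivatives of the profile log-likelihood."""
    k, n = len(counts) - 1, sum(counts)
    n_k = counts[k]
    c = c_value(a, b, k)
    r = (n - n_k) * c / (1 - c)  # value of (1-w) c n_k / (w + (1-w) c) at omega-hat
    d_a = (sum(a1(a, i) * counts[i] for i in range(1, k))
           - (n - n_k) * a1(a + b, k) + r * (a1(a, k) - a1(a + b, k)))
    d_b = (sum(a1(b, k - i) * counts[i] for i in range(k))
           - (n - n_k) * a1(a + b, k) - r * a1(a + b, k))
    return d_a, d_b

counts = [4961, 90, 43, 12, 95]  # observed frequencies, Table 1
a, b, h = 0.03, 2.0, 1e-6        # arbitrary trial point and step size
d_a, d_b = profile_score(a, b, counts)
num_a = (profile_loglik(a + h, b, counts) - profile_loglik(a - h, b, counts)) / (2 * h)
num_b = (profile_loglik(a, b + h, counts) - profile_loglik(a, b - h, counts)) / (2 * h)
print(d_a, num_a)  # analytic and numerical derivatives should agree closely
print(d_b, num_b)
```

A Newton-Raphson step would combine these scores with the second-order derivatives given above; a check of this kind is a cheap way to catch sign errors before iterating.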

Consistency and asymptotic efficiency of the MLEs can be established using
the multiparameter discrete distribution version of theorems utilized in Giesbrecht
and Kempthorne (1976) and proved in Kulldorff (1957). The statement of
Kulldorff's theorem has been tailored somewhat to suit the three-parameter, discrete
distribution case.

Theorem (Kulldorff 1957). Let $f_i = f(X = i)$. The parameter space of $\theta$ is denoted
$\Omega$ and is an open ball. If the following regularity conditions are satisfied:
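i) $\partial \log f_i/\partial\theta_j$ and $\partial^2 \log f_i/\partial\theta_j\,\partial\theta_l$ exist for every $\theta \in \Omega$, $i = 0, \ldots, k$, $j, l = 1, 2, 3$;

ii) $\sum_{i=0}^{k} \partial f_i/\partial\theta_j = 0$ and $\sum_{i=0}^{k} \partial^2 f_i/\partial\theta_j\,\partial\theta_l = 0$ for all $\theta \in \Omega$;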

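iii) The information matrix $I(\theta)$ is positive definite;

iv) There exist numbers $\{H_i\}$ (independent of the parameters except possibly the
true parameter values) and a positive, twice differentiable function $g(\theta)$ such that

$$\left|\frac{\partial^2}{\partial\theta_j\,\partial\theta_l}\left[g(\theta)\,\frac{\partial \log f_i}{\partial\theta_m}\right]\right| < H_i, \quad 1 \le j, l, m \le 3,$$

for all parameter values in $\Omega$, where $\sum_i H_i < \infty$;

then $\hat\theta$ is unique and $\sqrt{n}(\hat\theta - \theta) \xrightarrow{d} MVN(0, [I(\theta)]^{-1})$ as $n \to \infty$. $\square$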


For the MBBD let $\theta = (\alpha, \beta, \omega)$, $\Omega = (0, \infty) \times (0, \infty) \times (0, 1)$ and $f_i = f_i^{MB}$.
We will now show that the above conditions are satisfied.

i) As $f_i^{MB}$ is a ratio of polynomials in $\theta$, it is clear that all third order partial
derivatives of $\log f_i^{MB}$ exist on $\Omega$.

ii) We have $\sum_{i=0}^{k} f_i^{MB} = 1$ so these two conditions are satisfied if we can
interchange the order of differentiation (w.r.t. $\theta_j$) and summation. As the sum is over
only a finite number of terms and the second order derivatives exist (by (i)) this
interchange is valid.

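iii) The $3 \times 3$ information matrix $I(\theta)$ has elements

$$I_{11}(\theta) = (1-\omega)\sum_{i=0}^{k-1} [A_1(\alpha, i) - A_1(\alpha+\beta, k)]^2 f_i^{BB} + \frac{c^2(1-\omega)^2\,[A_1(\alpha, k) - A_1(\alpha+\beta, k)]^2}{(1-\omega)c + \omega},$$

$$I_{22}(\theta) = (1-\omega)\sum_{i=0}^{k-1} [A_1(\beta, k-i) - A_1(\alpha+\beta, k)]^2 f_i^{BB} + \frac{c^2(1-\omega)^2\,[A_1(\alpha+\beta, k)]^2}{(1-\omega)c + \omega},$$

$$I_{33}(\theta) = \frac{1 - c}{[(1-\omega)c + \omega](1-\omega)},$$

$$I_{12}(\theta) = (1-\omega)\sum_{i=0}^{k-1} [A_1(\alpha, i) - A_1(\alpha+\beta, k)]\,[A_1(\beta, k-i) - A_1(\alpha+\beta, k)]\,f_i^{BB} - \frac{c^2(1-\omega)^2\,[A_1(\alpha, k) - A_1(\alpha+\beta, k)]\,A_1(\alpha+\beta, k)}{(1-\omega)c + \omega},$$

$$I_{13}(\theta) = \frac{c\,[A_1(\alpha, k) - A_1(\alpha+\beta, k)]}{(1-\omega)c + \omega},$$

$$I_{23}(\theta) = -\frac{c\,A_1(\alpha+\beta, k)}{(1-\omega)c + \omega}.$$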

The proof that $I(\theta)$ is positive definite when $k \ge 3$ is somewhat messy. The full
details are given by Danaher (1987). The information matrix is singular when $k \le 2$
as this corresponds to the situation where we have more parameters to estimate than
we have data.


iv) The only problem points are when $\alpha + \beta$ is near 0 or when $\omega$ is near 1. One
possible $g$ which fulfills this requirement is $g(\theta) = (1-\omega)e^{-(1/\alpha + 1/\beta)}$. This function
is suitable since $g(\theta)/\alpha$, $g(\theta)/\beta$, and all requisite derivatives tend to zero as
$\alpha$, $\beta$, and $\alpha + \beta$ tend to 0. In addition $g(\theta)$ eliminates any problems when $\omega = 1$.


It follows from Kulldorff (1957) that the MLEs are best asymptotically normal
with covariance matrix $I^{-1}(\theta)$.

Kulldorff's (1957) results can also be applied to the BBD to show the MLEs
of $\alpha$ and $\beta$ are consistent and asymptotically normal, something Kleinman (1973)
stated he was unable to do. Minor modifications to the above method are necessary;
for example, let $\omega = 0$ in $g(\theta)$.

4.  Applications 

The survey data we will use here come from the AGB: McNair Surveys New
Zealand Ltd. "National Media Survey" of 5201 residents of New Zealand conducted
in 1985. In the survey two of the questions asked of the respondents were (for weekly
magazines):

Q1) "Have you personally read or looked into any issue of ...(magazine name)
in the last seven days - it doesn't matter where?" (Has a Y/N answer).

Q2) "How many different issues of ...(magazine name), if any, do you personally
read or look into in an average month - it doesn't matter where?" (Has answer
0, 1, 2, 3, 4 issues).

The wording of Q1 and Q2 is modified appropriately for fortnightly, monthly
and two-monthly magazines. These questions were asked for forty different
magazines.

An  implicit  assumption  in  the  magazine  advertising  field  is  that  a  person  who 
reads  a  magazine  is  exposed  to  all  the  advertisements  in  that  magazine.  This  is 
unlikely to be true for people who meet the criterion of "read" in Q1 and Q2.
However,  it  is  usually  impractical  to  ask  respondents  which  advertisements  they 
have  been  exposed  to  so  we  cannot  avoid  making  this  assumption  for  the  available 
data. 

The parameter estimates for the National Business Review are given in Table 1.
The estimate of $\omega$ tells us that 1.7% of the respondents always read this magazine.
From Q1 we can estimate that in any particular week 2.6% of the population will
read the National Business Review. This implies that of the 2.6% who read this
magazine in any particular week 1.7/2.6 = 65.4% read it every week. This gives
the National Business Review a high readership loyalty, something well known by
its publishers.

Let $e_i$ be the estimated number of people who have $i$ exposures. The $e_i$'s in
Table 1 come from substituting the estimated parameters of Table 1 into (2.2).
Then the Pearson $\chi^2$ goodness of fit statistic is defined to be
$\chi^2 = \sum_{i=0}^{k} (n_i - e_i)^2/e_i = \sum_{i=0}^{k} c_i$ (say). We can interpret $c_i$ as the
contribution to $\chi^2$ from the $i$th exposure. In this case the $\chi^2$ goodness of fit
statistic for the BBD is significant (p-value < 0.001) but for the MBBD it is not
significant (p-value > 0.1). The $c_i$'s for the two distributions (values are in
parentheses next to expected frequencies) show that a considerable improvement in
accuracy has been made, particularly for three exposures.
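The contributions $c_i$ can be recomputed from the observed and expected frequencies listed in Table 1. The sketch below uses the tabulated (rounded) expected frequencies, so the totals may differ slightly in the last digit from those reported; the function name is an illustrative assumption.

```python
observed = [4961, 90, 43, 12, 95]
expected_bbd = [4961.3, 67.9, 43.4, 42.5, 85.8]
expected_mbbd = [4960.9, 95.5, 35.3, 15.3, 95.0]

def pearson_contributions(obs, exp):
    """Per-cell contributions (n_i - e_i)^2 / e_i to the Pearson statistic."""
    return [(o - e) ** 2 / e for o, e in zip(obs, exp)]

for name, exp in (("BBD", expected_bbd), ("MBBD", expected_mbbd)):
    contrib = pearson_contributions(observed, exp)
    print(name, [round(c, 2) for c in contrib], round(sum(contrib), 2))
```

The three-exposure cell dominates the BBD total, which is the improvement the text refers to.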

Table 1 also gives the likelihood ratio test for $H_0 : \omega = 0$ vs. $H_1 : \omega > 0$. It
shows that $\omega$ is significantly nonzero (there is 1 df for this test).
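For a test with 1 df, the p-value can be obtained without a statistics library, because a chi-square variable with 1 df is the square of a standard normal. The following sketch (the function name is an assumption) evaluates the upper tail probability for the likelihood ratio statistic of 18.8 reported in Table 1.

```python
from math import erfc, sqrt

def chi2_sf_1df(stat):
    """P(chi-square with 1 df > stat), using P(|Z| > sqrt(stat)) for normal Z."""
    return erfc(sqrt(stat / 2.0))

print(chi2_sf_1df(18.8))  # far below conventional significance levels
```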

The important goodness of fit criterion to an advertising agency is not the $\chi^2$
statistic (Naples 1979). They measure the closeness of the fit by three other criteria.

The first is reach, which is defined as the proportion of the population which
is exposed to at least one of the advertisements, i.e., $1 - f^{MB}(X = 0)$. The second
criterion is effective reach, the mean of the e.d. The third criterion is single issue
reach, the proportion of the population exposed to any one issue of a magazine.
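As a small illustration, reach and effective reach can be computed directly from any exposure frequency table, observed or fitted. The function names are assumptions; the data are the observed frequencies from Table 1. Single issue reach is not computed here because its observed value comes from a separate survey question (Q1).

```python
def reach(freq):
    """Proportion with at least one exposure: 1 - f(0)."""
    return 1.0 - freq[0] / sum(freq)

def effective_reach(freq):
    """Mean number of exposures per person."""
    return sum(i * f for i, f in enumerate(freq)) / sum(freq)

observed = [4961, 90, 43, 12, 95]  # exposures 0..4, Table 1
print(round(100 * reach(observed), 3))      # reach in percent
print(round(effective_reach(observed), 3))  # mean exposures
```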

It  can  be  seen  in  Table  1  that  for  the  three  criteria  above  the  MBBD  produces 
estimates  closer  to  the  observed  values  than  the  BBD. 


Table 1: Readership data for the National Business Review showing
the fits for the BBD and the MBBD. Sample size = 5201.

Number of    Observed     Expected Frequency ($c_i$)
Exposures    Frequency    BBD               MBBD
0            4961         4961.3 (0.00)     4960.9 (0.00)
1              90           67.9 (7.19)       95.5 (0.32)
2              43           43.4 (0.00)       35.3 (1.68)
3              12           42.5 (21.89)      15.3 (0.71)
4              95           85.8 (0.99)       95.0 (0.00)

                                   Observed    BBD      MBBD
Parameter estimates
  $\hat\alpha$                                 0.012    0.024
  $\hat\beta$                                  0.372    2.113
  $\hat\omega$                                          0.017
$\chi^2$ goodness of fit                       30.0     2.7
d.o.f.                                         2        1
Likelihood ratio test statistic                         18.8
Reach %                            4.614                4.616
Effective reach                    0.114       0.120    0.114
Single issue reach %               2.59        3.01     2.84

Of course, it is expected that the addition of a parameter will make the model
more flexible and hence improve the fit. Notice, however, that the shape of the
distribution of $P$ for the MBBD is different from that of the BBD, as shown in Figure
1. The essence of $F_P^{MB}$ is a reverse J-shape distribution with a jump at $p = 1$. This,
empirically and intuitively, is a better distribution to allow for magazine reading
loyalty. On the other hand $F_P^{BB}$ assumes a U-shape which puts too much weight on
the probability of three exposures, a property not consistent with the data.

Assuming our MBBD model is correct we can use (2.2) and (2.3) to write down
expressions for reach ($\rho$):

$$\rho = 1 - f^{MB}(X = 0) = 1 - (1-\omega)\,\frac{\Gamma(\alpha+\beta)}{\Gamma(\alpha+\beta+k)}\,\frac{\Gamma(\beta+k)}{\Gamma(\beta)};$$

and effective reach ($\rho_e$):

$$\rho_e = k\,\frac{\alpha+\omega\beta}{\alpha+\beta}.$$

Since both $\rho$ and $\rho_e$ are differentiable functions of $(\alpha, \beta, \omega)$ we can obtain
asymptotic variances for $\hat\rho$ and $\hat\rho_e$ using the delta method and the information
matrix $I(\hat\theta)$ given in Section 3. The asymptotic 95% confidence interval for $\rho$ for
the data in Table 1 is [4.03, 5.20]% and the 95% asymptotic confidence interval for
$\rho_e$ is [0.098, 0.130].
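For reference, the delta-method step can be written out as follows. This is a sketch under the assumption that $I(\theta)$ denotes the information per observation, as in the theorem of Section 3; the symbol $h$ for a generic criterion is introduced here for illustration only.

$$\mathrm{var}\bigl(h(\hat\theta)\bigr) \approx \frac{1}{n}\,\nabla h(\theta)^{T}\,[I(\theta)]^{-1}\,\nabla h(\theta), \qquad h(\hat\theta) \pm 1.96\,\sqrt{\widehat{\mathrm{var}}\bigl(h(\hat\theta)\bigr)}.$$

For effective reach, $h(\theta) = k(\alpha + \omega\beta)/(\alpha + \beta)$, the gradient is

$$\frac{\partial h}{\partial\alpha} = \frac{k\beta(1-\omega)}{(\alpha+\beta)^2}, \qquad \frac{\partial h}{\partial\beta} = -\frac{k\alpha(1-\omega)}{(\alpha+\beta)^2}, \qquad \frac{\partial h}{\partial\omega} = \frac{k\beta}{\alpha+\beta},$$

with all quantities evaluated at the MLEs when forming the interval.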

Just because the MBBD fit is better for one magazine it does not mean that
it is always better than the BBD. We compared the fit for all forty magazines in
the survey, which covered the entire spectrum of entertainment magazines through
to computer magazines. The average absolute error between estimated reach and
sample reach for the forty magazines is 0.030% for the BBD and 0.036% for the
MBBD. The average absolute error between estimated effective reach and sample
effective reach is 1.36% for the BBD and 0.28% for the MBBD. The average absolute
error between estimated single issue reach and sample single issue reach is 0.44% for
the BBD and 0.36% for the MBBD. Summing up, the MBBD error is marginally
worse than the BBD error when estimating reach but the MBBD is clearly better
at estimating the effective reach and single issue reach. Overall it is fair to say that
the MBBD gives an improved fit over the BBD for magazine e.d. data.

The MBBD can also be utilized as a marginal distribution when estimating the
reach and effective reach of advertising schedules in higher dimensions (Danaher
1987). In addition the MBBD has applications not only for magazine readership
but also to television viewership and to newspapers, whose readership exhibits a
high level of loyalty.

The  MBBD  need  not  be  restricted  to  fitting  media  exposure  data.  If  we  have  a 
proportion  of  the  population  which  always  behaves  in  a  specified  way  whilst  the  rest 
of  the  population  has  a  certain  probability  of  behaving  in  the  specified  way  then  the 
MBBD  should  be  considered  as  a  possible  model  instead  of  (say)  the  BBD.  Such  a 
situation arises in consumer purchasing behaviour, as the following example shows.

Morrison  (1979)  uses  some  purchase  intention  data  of  Juster  (1966)  in  which 
Juster  asks  respondents  to  rate  their  purchase  intentions  for  autos  and  appliances 
on  a  scale  from  0  to  1  in  0.1  gradations.  Zero  is  for  no  intention  and  1  is  for  an 
almost  certain  purchase  (see  Morrison  (1979)  for  details).  A  follow-up  study  was 
conducted  in  which  Juster  asked  the  respondents  if  they  actually  bought  an  auto  or 
appliance.  Morrison  constructs  a  model  to  predict  actual  purchase  behaviour  from 
stated  purchase  intention  in  which  he  uses  the  BBD  to  fit  the  intention  data. 

The  data  have  the  following  characteristics:  a  large  group  have  no  purchase 
intention,  some  people  have  a  probable  purchase  intention  and  some  people  are 
certain  to  purchase  in  the  future.  Owing  to  the  nature  of  people’s  intentions  as 
revealed  by  Juster’s  data  the  MBBD  is  a  good  distribution  to  use  to  fit  the  data 
instead  of  the  BBD. 

In Table 2 we see that both the BBD and MBBD give an excellent fit to the
data as neither of their $\chi^2$ goodness-of-fit statistics is significant. The $\chi^2$ for the
MBBD is smaller than that of the BBD, and the likelihood ratio test (with
8 - 7 = 1 d.f.) gives a marginally significant p-value of 0.065. Hence, the MBBD
gives a better fit than the BBD for these data. A parallel study to Morrison's could,
therefore, be done using his method but replacing the BBD with the MBBD.

Table 2: Purchase intention data for appliances to be
bought in the next 12 months. Sample size = 2688.

Intention    Observed     Expected Frequency
Scale        Frequency    BBD        MBBD
0.0          2377         2373.5     2377.0
0.1            87           85.4       90.7
0.2            57           45.8       47.7
0.3            29           32.3       32.9
0.4            23           25.6       25.5
0.5            22           21.8       21.0
0.6            21           19.5       18.1
0.7            14           18.3       16.1
0.8            11           17.9       14.8
0.9            17           19.0       14.1
1.0            30           24.9       30.0

                                   BBD      MBBD
Parameter estimates
  $\hat\alpha$                     0.035    0.038
  $\hat\beta$                      0.687    0.873
  $\hat\omega$                              0.006
$\chi^2$ goodness of fit           8.43
d.o.f.                             8        7
Likelihood ratio test statistic             3.5
p-value                                     0.065

5.  Empirical  Bayes  Estimate  of  Single  Issue  Reach 

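We can think of $F_P^{MB}(p)$ of (2.1) as the prior distribution of an individual's
exposure to a single issue of a magazine. The posterior distribution of $P \mid X = x$
then follows from Bayes' theorem. Under squared error loss the Bayes estimate of $p$
is the mean of the posterior distribution (Berger 1980). If we estimate the parameters
$\alpha$, $\beta$, and $\omega$ from the data we get the empirical Bayes estimate (Casella 1985). The
empirical Bayes estimate of single issue reach under the distribution of $P$ in (2.1) is

$$\hat p_{MB}(x) = \begin{cases} \dfrac{\hat\alpha + x}{\hat\alpha + \hat\beta + k}, & x = 0, \ldots, k-1; \\[2ex] \dfrac{(1-\hat\omega)\,c(\hat\alpha, \hat\beta)\,\dfrac{\hat\alpha + k}{\hat\alpha + \hat\beta + k} + \hat\omega}{(1-\hat\omega)\,c(\hat\alpha, \hat\beta) + \hat\omega}, & x = k. \end{cases} \tag{5.1}$$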
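A sketch of the estimator follows. It assumes that for $x < k$ the point mass at $p = 1$ has zero likelihood, so the posterior is the usual beta posterior with mean $(\alpha + x)/(\alpha + \beta + k)$, and that for $x = k$ the posterior mixes that beta posterior with the point mass at 1, with weights proportional to $(1-\omega)c$ and $\omega$. The function name is illustrative. The inputs are the rounded Table 1 estimates, so the values only approximate those the paper reports for the National Business Review.

```python
from math import exp, lgamma

def eb_single_issue_reach(x, k, alpha, beta, omega):
    """Posterior mean of P given X = x under the beta-plus-point-mass prior."""
    beta_mean = (alpha + x) / (alpha + beta + k)
    if x < k:
        return beta_mean  # the point mass at p = 1 cannot produce x < k
    c = exp(lgamma(alpha + k) + lgamma(alpha + beta)
            - lgamma(alpha) - lgamma(alpha + beta + k))
    # Weighted average of the beta posterior mean and the point mass at 1.
    return ((1 - omega) * c * beta_mean + omega) / ((1 - omega) * c + omega)

for x in range(5):
    print(x, round(eb_single_issue_reach(x, 4, 0.024, 2.113, 0.017), 4))
```

The value at $x = k$ is sensitive to rounding in $\hat\omega$ and $c$, since both are small and appear in a ratio.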


This estimator makes a great deal of sense since if $x = 0$ in the survey the MLE is
$\hat p_{MLE}(0) = 0$, which implies that a person will never read the magazine, whereas a
person may buy the magazine on impulse or glance at it in a doctor's surgery, for
example. On the other hand, if $x = 4$ in the survey for a weekly magazine, the MLE
implies a person always reads the magazine, which is unlikely to be true since various
reasons may prevent a person from reading a particular issue of a magazine. That
is, this estimator tends to moderate, from the extreme, an individual's exposure
probability.

We calculated the empirical Bayes estimates of single issue reach for the
National Business Review thus:

$$\hat p_{MB}(x) = \begin{cases} 0.163(x + 0.024), & x = 0, \ldots, 3; \\ 0.9700, & x = 4. \end{cases} \tag{5.2}$$

Such an estimator is useful in simulation studies, for instance, if it were required
to estimate the audience for a schedule which combined different media types and no
exposure model were available. Suppose, for example, a schedule has 4 insertions in
a particular magazine and three insertions in a particular television time slot. Extract
the response to Q2 (call it $x_{mag}$) for the magazine, then calculate $\hat p_{MB}(x_{mag})$. Now
simulate 4 Bernoulli trials with probability of success $\hat p_{MB}(x_{mag})$. Keeping with
the same individual, extract that person's probability of viewing television in the
desired time slot and conduct 3 Bernoulli trials as before. The total number of
successes for the 7 trials is the individual's total exposure. Repeat this procedure for
each individual in the survey. Gifford (personal communication) used simulation
with the MLE ($\hat p_{MLE}(x) = x/k$) of single issue reach to estimate the audience
for a combined magazine/television schedule in such a way. Table 3 shows the
results of averaging 50 simulated e.d.'s for the National Business Review using
the personal probabilities given by the MLE, the BBD empirical Bayes estimator
($\hat p_B(x) = (\hat\alpha + x)/(\hat\alpha + \hat\beta + k) = 0.228(x + 0.012)$) and the MBBD empirical Bayes
estimator (given by (5.2)).
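The magazine part of this procedure can be sketched as below. The sketch assumes that the Table 1 observed frequencies are the Q2 responses, uses the MLE personal probabilities $x/k$, and averages 50 simulated e.d.'s. The function name, the seed and the use of Python's `random` module are illustrative choices, and the television step is omitted.

```python
import random

def simulate_exposure_distribution(responses, prob_by_response, trials, rng):
    """One simulated e.d.: each person gets `trials` Bernoulli draws."""
    counts = [0] * (trials + 1)
    for x in responses:
        p = prob_by_response[x]
        exposures = sum(rng.random() < p for _ in range(trials))
        counts[exposures] += 1
    return counts

observed = [4961, 90, 43, 12, 95]  # Q2 responses 0..4 (assumed from Table 1)
responses = [x for x, n_x in enumerate(observed) for _ in range(n_x)]
p_mle = [x / 4 for x in range(5)]  # MLE personal probabilities x / k

rng = random.Random(0)             # fixed seed, for reproducibility only
runs = [simulate_exposure_distribution(responses, p_mle, 4, rng)
        for _ in range(50)]
average = [sum(run[j] for run in runs) / len(runs) for j in range(5)]
print(average)
```

Replacing `p_mle` with a list of empirical Bayes probabilities gives the other rows of the comparison. With the MLE, respondents answering 0 or 4 are never moved, which is why that simulation cannot moderate the extremes.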

All the $\chi^2$ statistics are significant (p-value < 0.01) whereas the $\chi^2$ for the
National Business Review using the MBBD model (from Table 1) is 2.7 (p-value
> 0.1) so that simulation methods are inferior to fitting a model in this example.
Nonetheless, the empirical Bayes estimate of personal probability, based on the
MBBD model, gives the best results of the three $\hat p(x)$'s used. From this example
we infer it is best to use simulation only when it is impossible to construct an e.d.
model, and that the empirical Bayes estimate, based on the MBBD in (5.1), is likely
to give better results than simulations based on the MLE.
Table 3: E.d.'s obtained by averaging 50 simulated e.d.'s using
$\hat p_B(x)$ and $\hat p_{MB}(x)$ as estimates of personal probability.

             Exposures
             0         1        2       3       4        $\chi^2$
Observed     4961      90       43      12      95
MLE          4991.9    48.4     36.6    21.9    102.3    42.2
BBD          4945.8    101.7    39.4    43.9             33.4
MBBD         4935.1    132.3    29.1    16.2    88.3     21.9

Acknowledgement 


This paper is based on part of the author's Ph.D. dissertation, for which the
author gratefully acknowledges the assistance of Professor Duane Meeter. The
author would also like to thank Ron Stroeven and Clive Gifford of AGB: McNair
Surveys New Zealand Ltd. Ron provided the data and Clive gave helpful insights
into techniques for simulating magazine exposure distributions.

References 

Berger, J.O. (1980). Statistical Decision Theory. Springer-Verlag, New York.

Casella,  G.  (1985).  An  introduction  to  empirical  Bayes  data  analysis.  The  American 
Statistician.  39,  83-90. 

Chandon,  J-L.  J.  (1976).  A  comparative  study  of  media  exposure  models.  Ph.D. 
dissertation,  Northwestern  University. 

Danaher, P.J. (1987). Estimating multidimensional tables from survey data:
Predicting magazine audiences. Ph.D. dissertation, Florida State University.

Ehrenberg, A.S.C. (1975). Data Reduction: Analysing and Interpreting Statistical
Data. Wiley, New York.

Giesbrecht,  F.  and  Kempthorne,  O.  (1976).  Maximum  likelihood  estimation  in  the 
three-parameter  lognormal  distribution.  J.  R.  Statist.  Soc.  B,  38,  257-264. 

Griffiths, D.A. (1973). Likelihood estimation for the beta-binomial distribution and
an application to the household distribution of the total number of cases of a
disease. Biometrics. 29, 637-648.

Ishii,  G.  and  Hayakawa,  R.  (1960).  On  the  compound  binomial  distribution.  Annals 
of  the  Institute  of  Statistical  Mathematics.  12,  69-80. 

Johnson,  N.L.  and  Kotz,  S.  (1969).  Discrete  Distributions.  John  Wiley  and  Sons, 
New  York. 

Juster, F.T. (1966). Consumer buying intentions and purchase probability. J.
American Statist. Assoc. 61, 658-696.

Kleinman, J.C. (1973). Proportions with extraneous variance: single and
independent samples. J. American Statist. Assoc. 68, 46-54.

Kulldorff, G. (1957). On conditions for consistency and asymptotic efficiency of
maximum likelihood estimates. Skandinavisk Aktuarietidskrift. 40, 129-144.

Metheringham, R.A. (1964). Measuring the net cumulative coverage of a print
campaign. J. of Advertising Research 4, 23-28.

Moore, D.F. (1987). Modelling the extraneous variance in the presence of
extra-binomial variation. Applied Statistics. 36, 1, 8-14.

Morrison, D.G. (1979). Purchase intentions and purchasing behavior. J. of
Marketing. 43, 65-74.

Naples,  M.J.  (1979).  Effective  Frequency:  The  Relationship  between  Frequency  and 
Advertising  Awareness.  Association  of  National  Advertisers,  New  York. 

Rust,  R.T.  and  Klompmaker,  J.E.  (1981).  Improving  the  estimation  procedure  for 
the  beta  binomial  t.v.  exposure  model.  Journal  of  Marketing  Research.  18, 
442-448. 

Sabavala, D.J. and Morrison, D.G. (1977). Television show loyalty: a beta-binomial
model using recall data. Journal of Advertising Research. 17, 35-43.

Skellam, J.G. (1948). A probability distribution derived from the binomial
distribution by regarding the probability of success as a variable between sets of trials.
J. R. Statist. Soc. B, 10, 257-265.


